Efficient Index Maintenance Under Dynamic Genome Modification

Authors

  • Nitish Gupta
  • Komal Sanjeev
  • Tim Wall
  • Carl Kingsford
  • Robert Patro
Abstract

Efficient text indexing data structures have enabled large-scale genomic sequence analysis and are used to help solve problems ranging from assembly to read mapping. However, these data structures typically assume that the underlying reference text is static and will not change over the course of the queries being made. Some progress has been made in exploring how certain text indices, like the suffix array, may be updated, rather than rebuilt from scratch, when the underlying reference changes. Yet, these update operations can be complex in practice, difficult to implement, and give fairly pessimistic worst-case bounds. We present a novel data structure, SkipPatch, for maintaining a k-mer-based index over a dynamically changing genome. SkipPatch pairs a hash-based k-mer index with an indexable skip list that is used to efficiently maintain the set of edits that have been applied to the original genome. SkipPatch is practically fast, significantly outperforming the dynamic extended suffix array in terms of update and query speed.
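The idea of patching a hash-based k-mer index in place, rather than rebuilding it, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_index` and `substitute` are hypothetical, the genome is held as a plain Python string, and only single-base substitutions are handled. Insertions and deletions shift coordinates, which is the role the abstract assigns to the indexable skip list; that part is deliberately left out here.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map every k-mer to the set of positions where it starts."""
    index = defaultdict(set)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].add(i)
    return index


def substitute(genome, index, k, pos, base):
    """Apply a single-base substitution and patch only the affected k-mers.

    Hypothetical helper for illustration: at most k k-mers overlap `pos`,
    so only those index entries are touched.
    """
    lo = max(0, pos - k + 1)          # first k-mer start that covers pos
    hi = min(pos, len(genome) - k)    # last k-mer start that covers pos

    # Drop the stale k-mers that overlap the edited position.
    for i in range(lo, hi + 1):
        kmer = genome[i:i + k]
        index[kmer].discard(i)
        if not index[kmer]:
            del index[kmer]

    # Apply the edit, then register the replacement k-mers.
    genome = genome[:pos] + base + genome[pos + 1:]
    for i in range(lo, hi + 1):
        index[genome[i:i + k]].add(i)
    return genome


if __name__ == "__main__":
    g = "ACGTACGT"
    idx = build_index(g, 3)
    g = substitute(g, idx, 3, 4, "T")
    # The patched index should match one rebuilt from scratch.
    print(g, dict(idx) == dict(build_index(g, 3)))
```

The update touches at most k index entries per substitution, independent of genome length, which is the property that makes in-place maintenance attractive compared with a full rebuild.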


Similar resources

Efficient Index Maintenance for Moving Objects with Future Trajectories

Recently, more research has been conducted on moving object databases (MOD). Typically, there are three kinds of data for dynamic attributes in MOD, i.e., historical, current and future. Although many index structures have been developed for the former two types of data, there is not much work to deal with the future data. In particular, the problem of index update has not been addressed with e...


Performance Evaluation of SSD-Index Maintenance Schemes in IR Applications

With the advent of flash-memory-based storage devices (SSDs), there is considerable interest within the computer industry in using them for many different types of application. Among these, the dynamic index structure of large text collections has been a primary issue in Information Retrieval applications. Previous studies have proven the three approaches to ...


Integrated Inspection Planning and Preventive Maintenance for a Markov Deteriorating System Under Scenario-based Demand Uncertainty

In this paper, a single-product, single-machine system under Markovian deterioration of machine condition and demand uncertainty is studied. The objective is to find the optimal intervals for inspection and preventive maintenance activities in a condition-based maintenance planning with discrete monitoring framework. At first, a stochastic dynamic programming model whose state variable is the ...


Influences of Track Structure, Geometry and Traffic Parameters on Railway Deterioration

The roles of the parameters that most influence railway track deterioration are examined in this research, with a view to making railway track maintenance more effective and cost-efficient. The results presented are based on a comprehensive study of railway track degradation covering superstructure, substructure and geometrical aspects. The changes in TQI (Track Quality Index), the track set...



Journal title:
  • CoRR

Volume abs/1604.03132  Issue 

Pages  -

Publication date 2016